Midterm

Author

Giuliet Kibler

Introduction

The National Health and Nutrition Examination Survey (NHANES) is a survey, typically conducted in two-year cycles, that estimates the dietary intake of Americans 1 year or older over the 24-hour period prior to the interview. This particular dataset combines data collected in the 2017-2018 cycle with data from 2019 to March 2020, because the NHANES program was suspended in March 2020 due to the COVID-19 pandemic. The dietary interview component of this survey is called “What We Eat in America” (WWEIA), and data are collected using the USDA’s Automated Multiple-Pass Method (AMPM). All participants are eligible for two dietary interviews: the first is conducted in person at the Mobile Examination Center, and the second is conducted by phone 3 to 10 days later. This dataset includes dietary information from the first interview and is a log of the total energy and nutrient intakes from foods and beverages consumed in the previous 24 hours.

Of particular interest in this dataset is the relationship between pre-pandemic intended macronutrient diet and actual intake. Additionally, there is interest in assessing the relationship between dietary intake and B vitamin intake. B vitamins are cofactors in many cellular pathways, including cellular metabolism and the synthesis of DNA and RNA, but are not stored by the body, so it is critical to replenish them daily through foods and supplements (Hanna et al., 2022). Therefore, this analysis assesses whether Americans 1 year or older are eating their intended macronutrient diet and whether their intake is associated with B vitamin intake pre-pandemic.

Methods

The P_DR1TOT dataset for 2017-March 2020 was downloaded from the CDC’s NHANES records of dietary data. This is a dataset from the WWEIA day 1 interviews, conducted pre-pandemic, and includes total dietary intake of participants.

Variables of interest include 6 special diet variables, referred to here as intended diet; 5 energy (caloric) and macronutrient variables; and vitamins B1, B2, and B6. These variables were extracted from the dataset and relabeled to be more informative. In the source data, 12 special diets are recorded in separate variables, each coded with its diet number (1-12) for yes or left missing for no. The extracted diet variables were recoded to 1 for yes and 0 for no. Since low calorie and high calorie diets are recorded separately, a new caloric diet variable was created where low calorie is 0, high calorie is 1, and neither is 2.
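
The recoding described above can be sketched as follows. This is a minimal base R sketch, not the code used for this report: the raw column names (`low_cal_raw`, `high_cal_raw`) and their values are hypothetical placeholders, and coding a participant who reports both diets as 0 is an assumption, since that case is not described here.

```r
# Toy data standing in for two raw special-diet columns (hypothetical names).
# Under the raw coding, a diet holds a number when reported and NA otherwise;
# the numbers used here are placeholders.
raw <- data.frame(
  low_cal_raw  = c(1, NA, NA, 1),
  high_cal_raw = c(NA, 2, NA, NA)
)

# Recode "number or missing" to 1 (yes) / 0 (no).
to_indicator <- function(x) as.integer(!is.na(x))

recoded <- data.frame(
  Low_Calorie  = to_indicator(raw$low_cal_raw),
  High_Calorie = to_indicator(raw$high_cal_raw)
)

# Combined variable: 0 = low calorie, 1 = high calorie, 2 = neither.
# Assumption: if both diets were reported, low calorie takes precedence.
recoded$Calorie_diet <- ifelse(recoded$Low_Calorie == 1, 0,
                        ifelse(recoded$High_Calorie == 1, 1, 2))

print(recoded)
```

Keeping the two indicator columns alongside `Calorie_diet` means the combined variable can always be checked against its inputs.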

The association between intended diet and dietary intake was assessed using summary statistics and box plots.
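
As an illustration of this step, the sketch below computes per-group summary statistics, including the quartiles that define the IQR. It is a hedged example on made-up values: `summarise_intake` is a hypothetical helper, not a function from this analysis, and it counts only non-missing values, which may differ from how the Count column in the tables of this report was produced.

```r
# Toy data (made-up values) using the report's column names.
toy <- data.frame(
  Low_Sugar = c(0, 0, 0, 0, 1, 1, 1),
  Sugar     = c(90, 120, 60, NA, 50, 70, 65)
)

# Hypothetical helper: summarise one intake variable by one 0/1 diet indicator.
summarise_intake <- function(data, intake, diet) {
  groups <- split(data[[intake]], data[[diet]])
  rows <- lapply(names(groups), function(g) {
    x <- groups[[g]]
    x <- x[!is.na(x)]  # drop missing intakes before summarising
    data.frame(
      Diet   = g,
      Mean   = mean(x),
      Median = median(x),
      Q1     = unname(quantile(x, 0.25)),
      Q3     = unname(quantile(x, 0.75)),
      Min    = min(x),
      Max    = max(x),
      Count  = length(x)  # non-missing values only
    )
  })
  do.call(rbind, rows)
}

sugar_summary <- summarise_intake(toy, "Sugar", "Low_Sugar")
print(sugar_summary)
```

With the same toy data, `boxplot(Sugar ~ Low_Sugar, data = toy)` would draw the corresponding box plot, where overlapping boxes indicate overlapping IQRs.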

The proportion of participants below the recommended B vitamin intake for men was reported. The association of dietary intake and caloric diet type with B vitamin intake was assessed using scatter plots and fitted linear models.

[1] "Summary of Extracted Data"
# A tibble: 14,300 × 15
   Calories Low_Calorie High_Calorie Sugar Low_Sugar Carbohydrate
      <dbl>       <dbl>        <dbl> <dbl>     <dbl>        <dbl>
 1     1402           0            0  73.4         0         188.
 2     1046           0            0  27.9         0         122.
 3     1926           0            0 157.          0         247.
 4     1698           1            0  94.2         0         218.
 5     1251           0            0  84.8         0         160.
 6     1973           0            0 134.          0         273.
 7     2310           0            0  85           0         208.
 8       NA           0            0  NA           0          NA 
 9     1403           0            0 163.          0         266.
10     2385           0            0  60.8         0         305.
# ℹ 14,290 more rows
# ℹ 9 more variables: Low_Carbohydrate <dbl>, Fat <dbl>, Low_Fat <dbl>,
#   Protein <dbl>, High_Protein <dbl>, B1 <dbl>, B2 <dbl>, B6 <dbl>,
#   Calorie_diet <dbl>

Preliminary Results

Investigate correlation between intended diet and dietary intake

[1] "Summary of Sugar Intake"


|Low_Sugar |      Mean| Median|  Min|    Max| Count|
|:---------|---------:|------:|----:|------:|-----:|
|0         | 105.68458|  90.89| 0.00| 931.16| 14226|
|1         |  81.42301|  62.24| 5.16| 414.14|    74|

[1] "Summary of Carbohydrate Intake"


|Low_Carbohydrate |     Mean|  Median|  Min|     Max| Count|
|:----------------|--------:|-------:|----:|-------:|-----:|
|0                | 239.5814| 219.080| 0.00| 1586.24| 14143|
|1                | 178.5371| 165.395| 9.55|  673.39|   157|

[1] "Summary of Fat Intake"


|Low_Fat |     Mean| Median|  Min|    Max| Count|
|:-------|--------:|------:|----:|------:|-----:|
|0       | 81.03713| 71.805| 0.00| 567.96| 14154|
|1       | 80.43219| 72.620| 5.72| 253.70|   146|

[1] "Summary of Protein Intake"


|High_Protein |      Mean| Median|   Min|    Max| Count|
|:------------|---------:|------:|-----:|------:|-----:|
|0            |  72.16154|  64.72|  0.00| 545.20| 14261|
|1            | 108.33923| 100.61| 15.03| 309.18|    39|

Summary Table for Caloric Diets



| Calorie_diet|     Mean| Median| Min|   Max| Count|
|------------:|--------:|------:|---:|-----:|-----:|
|            0| 1968.596|   1834| 100|  7375|   799|
|            1| 2669.773|   2528| 553|  7632|    46|
|            2| 1995.732|   1821|   0| 12501| 13455|

Investigate correlations between dietary intake and B vitamins

Proportion of Participants Below B Vitamin Recommended Intake for Men (Hanna et al., 2022)

  Vitamin Proportion_Below
1      B1        0.4317301
2      B2        0.3513557
3      B6        0.2571821
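
A table like the one above could be produced as sketched below. This is an assumption-laden example rather than the report's own code: the thresholds are the values used in the plotting function in this report, the units are assumed to be mg per day, the intake values are made up, and dropping missing intakes before taking the proportion is an assumption.

```r
# Thresholds as used in this report's plotting code (units assumed: mg/day).
recommended <- c(B1 = 1.2, B2 = 1.3, B6 = 1)

# Toy intake values (made up).
toy <- data.frame(
  B1 = c(0.8, 1.5, 1.1, NA, 2.0),
  B2 = c(1.0, 1.4, 2.2, 1.9, NA),
  B6 = c(0.5, 1.2, 3.0, 2.5, 1.8)
)

# Share of non-missing intakes strictly below each threshold.
proportion_below <- sapply(names(recommended), function(v) {
  mean(toy[[v]] < recommended[[v]], na.rm = TRUE)
})

result <- data.frame(
  Vitamin          = names(proportion_below),
  Proportion_Below = unname(proportion_below)
)
print(result)
```

Taking the mean of a logical vector gives the proportion of `TRUE` values directly, so no separate counting step is needed.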

Graphing B Vitamins vs Dietary Intake

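## Each nutrient-vitamin pair gets its own plot for readability
library(dplyr)    # provides filter()
library(ggplot2)  # provides ggplot() and the geoms used below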
create_scatter_plots <- function(data) {
  # List of nutrient variables to plot
  nutrient_vars <- c("Calories", "Carbohydrate", "Fat", "Protein")
  # List of vitamin variables to plot against
  vitamin_vars <- c("B1", "B2", "B6")
  
  # Define recommended values for B vitamins
  recommended_values <- list(
    B1 = 1.2,
    B2 = 1.3,
    B6 = 1
  )
  
  # Loop through each nutrient variable
  for (nutrient in nutrient_vars) {
    # Check if the nutrient exists in the data
    if (!nutrient %in% colnames(data)) {
      message(paste("Nutrient variable", nutrient, "not found in the data. Skipping."))
      next
    }
    
    # Loop through each vitamin variable
    for (vitamin in vitamin_vars) {
      # Check if vitamin exists in the data
      if (vitamin %in% colnames(data)) {
        # Prepare the data for modeling
        temp_data <- data[, c(nutrient, vitamin)]
        colnames(temp_data) <- c("Nutrient", "Vitamin")
        
        # Filter out missing and non-finite values
        temp_data <- temp_data |>
          na.omit() |>
          filter(is.finite(Nutrient) & is.finite(Vitamin))
        
        # Check if there are enough data points for modeling
        if (nrow(temp_data) > 1) {
          # Calculate the linear model and R^2
          model <- lm(Vitamin ~ Nutrient, data = temp_data)
          r_squared <- summary(model)$r.squared
          
          # Create the scatter plot with line of best fit
          scatter_plot <- ggplot(temp_data, aes(x = Nutrient, y = Vitamin)) +
            geom_point() +
            geom_smooth(method = "lm", se = FALSE, color = "blue") +  # Line of best fit
            labs(x = nutrient, y = vitamin) +
            theme_minimal() +
            ggtitle(paste("Scatter Plot of", vitamin, "vs", nutrient, "\nR² =", round(r_squared, 3)))
          
          # Add horizontal lines for recommended values
          scatter_plot <- scatter_plot +
            geom_hline(yintercept = recommended_values[[vitamin]], linetype = "dashed", color = "red", linewidth = 0.7) +
            # hjust = 1 right-aligns the label at the maximum x value so it stays inside the panel
            annotate("text", x = max(temp_data$Nutrient, na.rm = TRUE), y = recommended_values[[vitamin]], 
                     label = "Men's Recommended Dietary Intake", hjust = 1, vjust = -0.5, color = "red")
          
          # Print the scatter plot
          print(scatter_plot)
        } else {
          message(paste("Not enough data points for", vitamin, "vs", nutrient))
        }
      } else {
        message(paste("Vitamin variable", vitamin, "not found in the data. Skipping."))
      }
    }
  }
}

# Call the function with the data
create_scatter_plots(extracted_data)

Caloric Diet’s Effect on B vitamins
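
One way caloric diet type could be related to B vitamin intake is sketched below, using the `Calorie_diet` coding from the Methods (0 = low calorie, 1 = high calorie, 2 = neither). This is a hedged illustration on made-up values; the model specification is an assumption and is not taken from this report.

```r
# Toy data (made-up values) using the report's column names.
toy <- data.frame(
  Calorie_diet = c(0, 0, 0, 1, 1, 1, 2, 2, 2),
  B1           = c(1.0, 1.2, 1.4, 2.0, 2.2, 2.4, 1.5, 1.6, 1.7)
)

# Treat diet type as a categorical factor, with "Neither" as the reference level.
toy$Calorie_diet <- factor(
  toy$Calorie_diet,
  levels = c(2, 0, 1),
  labels = c("Neither", "Low calorie", "High calorie")
)

# Median B1 intake within each diet type.
group_medians <- tapply(toy$B1, toy$Calorie_diet, median)
print(group_medians)

# Linear model: each coefficient is a group's mean difference from "Neither".
model <- lm(B1 ~ Calorie_diet, data = toy)
print(coef(model))
print(summary(model)$r.squared)
```

The 0/1/2 codes are labels rather than an ordered scale, so entering them as a factor avoids fitting a single slope across categories that have no natural order.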

Conclusion

Overall, pre-pandemic participants’ intended diets were generally associated with their actual intake, with the low fat and low calorie diets as exceptions. Mean intakes are skewed to the right by high outliers, so medians were compared. The low sugar group had a lower median total sugar intake than those not on the diet (62.24 vs 90.89 g). Likewise, the low carbohydrate group ate fewer carbohydrates than those not on that diet (165.395 vs 219.08 g). On the other hand, those on a low fat diet ate slightly more fat than those not on the diet (72.62 vs 71.805 g), meaning the typical participant on a low fat diet did not eat less fat than other participants. The high protein group had substantially higher protein intake than those not on the diet (100.61 vs 64.72 g). Finally, the high calorie group had a substantially higher median caloric intake than either the low calorie group or those not on a caloric diet (2528 vs 1834 and 1821 kcal), but the low calorie group’s median was slightly higher than that of those not on a caloric diet (1834 vs 1821 kcal), meaning the typical participant on a low calorie diet ate slightly more calories than those not intending to restrict calories. However, the IQRs of dietary intake overlap across all intended diet groups, so how closely Americans followed an intended diet pre-pandemic varied. This lack of a conclusive pattern makes sense considering that dietary needs are relative to a person’s physiological demands.

B vitamin intake has a moderate association with dietary intake. Vitamins B1 and B2 are more strongly associated with all of the macronutrient intakes than B6, with the highest correlations occurring between B1 and caloric intake and between B1 and carbohydrate intake. This suggests that adequate dietary intake is important for daily replenishment of vitamins B1 and B2. Interestingly, the association between caloric diet type and B vitamin intake was not consistent across the vitamins, indicating that more than dietary intention alone is necessary for sufficient B vitamin intake. With 43% of participants below the recommended intake for men for B1, 35% for B2, and 26% for B6, many participants may need to eat more macronutrients and overall calories to meet the body’s B vitamin demands. In conclusion, the average American 1 year or older is largely eating their intended macronutrient diet, and their intake is moderately associated with B vitamin intake pre-pandemic.